---
title: Configuring the JDBC Connector for Hive Access (Optional)
---

You can use the PXF JDBC Connector to retrieve data from Hive. You can also use a JDBC named query to submit a custom SQL query to Hive and retrieve the results using the JDBC Connector.

This topic describes how to configure the PXF JDBC Connector to access Hive. When you configure Hive access with JDBC, you must take into account the Hive user impersonation setting, as well as whether or not the Hadoop cluster is secured with Kerberos.

*If you do not plan to use the PXF JDBC Connector to access Hive, then you do not need to perform this procedure.*


## <a id="hive_cfg_server"></a>JDBC Server Configuration

The PXF JDBC Connector is installed with the JAR files required to access Hive via JDBC, `hive-jdbc-<version>.jar` and `hive-service-<version>.jar`, and automatically registers these JARs.

When you configure a PXF JDBC server for Hive access, you must specify the JDBC driver class name, database URL, and client credentials just as you would when configuring a client connection to an SQL database.

To access Hive via JDBC, you must specify the following properties and values in the `jdbc-site.xml` server configuration file:

| Property       |  Value |
|----------------|--------|
| jdbc.driver | org.apache.hive.jdbc.HiveDriver |
| jdbc.url | jdbc:hive2://\<hiveserver2_host>:\<hiveserver2_port>/\<database> |

The value of the HiveServer2 authentication (`hive.server2.authentication`) and impersonation (`hive.server2.enable.doAs`) properties, and whether or not the Hive service is utilizing Kerberos authentication, will inform the setting of other JDBC server configuration properties. These properties are defined in the `hive-site.xml` configuration file in the Hadoop cluster. You will need to obtain the values of these properties.

The following table enumerates the Hive2 authentication and impersonation combinations supported by the PXF JDBC Connector. It identifies the possible Hive user identities and the JDBC server configuration required for each.

Table heading key:

- *authentication* -> Hive `hive.server2.authentication` Setting
- *enable.doAs* -> Hive `hive.server2.enable.doAs` Setting
- *User Identity* -> Identity that HiveServer2 will use to access data
- *Configuration Required* -> PXF JDBC Connector or Hive configuration required for *User Identity*

| authentication  |  enable.doAs | User Identity | Configuration Required |
|------------------|---------------|----------------|-------------------|
| `NOSASL` | n/a | No authentication | Must set `jdbc.connection.property.auth` = `noSasl`.  |
| `NONE`, or not specified | `TRUE` | User name that you provide | Set `jdbc.user`. |
| `NONE`, or not specified | `TRUE` | Greenplum user name | Set `pxf.service.user.impersonation` to `true` in `jdbc-site.xml`. |
| `NONE`, or not specified | `FALSE` | Name of the user who started Hive, typically `hive`  | None |
| `KERBEROS` | `TRUE` | Identity provided in the PXF Kerberos principal, typically `gpadmin` | Must set `hadoop.security.authentication` to `kerberos` in `jdbc-site.xml`. |
| `KERBEROS` | `TRUE` | User name that you provide  | Set `hive.server2.proxy.user` in `jdbc.url` and set `hadoop.security.authentication` to `kerberos` in `jdbc-site.xml`.  |
| `KERBEROS` | `TRUE` | Greenplum user name  | Set `pxf.service.user.impersonation` to `true` and `hadoop.security.authentication` to `kerberos` in `jdbc-site.xml`. |
| `KERBEROS` | `FALSE` | Identity provided in the `jdbc.url` `principal` parameter, typically `hive` |  Must set `hadoop.security.authentication` to `kerberos` in `jdbc-site.xml`. |

**Note**: There are additional configuration steps required when Hive utilizes Kerberos authentication.

## <a id="hive_cfg_server_proc"></a>Example Configuration Procedure

Perform the following procedure to configure a PXF JDBC server for Hive:

1. Log in to your Greenplum Database coordinator host:

    ``` shell
    $ ssh gpadmin@<coordinator>
    ```

2. Choose a name for the JDBC server.

3. Create the `$PXF_BASE/servers/<server_name>` directory. For example, use the following command to create a JDBC server configuration named `hivejdbc1`:

    ``` shell
    gpadmin@coordinator$ mkdir $PXF_BASE/servers/hivejdbc1
    ````

3. Navigate to the server configuration directory. For example:

    ```shell
    gpadmin@coordinator$ cd $PXF_BASE/servers/hivejdbc1
    ```

4. Copy the PXF JDBC server template file to the server configuration directory. For example:

    ``` shell
    gpadmin@coordinator$ cp <PXF_INSTALL_DIR>/templates/jdbc-site.xml .
    ```
        
4. When you access Hive secured with Kerberos, you also need to specify configuration properties in the `pxf-site.xml` file. *If this file does not yet exist in your server configuration*, copy the `pxf-site.xml` template file to the server config directory. For example:

    ``` shell
    gpadmin@coordinator$ cp <PXF_INSTALL_DIR>/templates/pxf-site.xml .
    ```
        
5. Open the `jdbc-site.xml` file in the editor of your choice and set the `jdbc.driver` and `jdbc.url` properties. Be sure to specify your Hive host, port, and database name:

    ``` xml
    <property>
        <name>jdbc.driver</name>
        <value>org.apache.hive.jdbc.HiveDriver</value>
    </property>
    <property>
        <name>jdbc.url</name>
        <value>jdbc:hive2://<hiveserver2_host>:<hiveserver2_port>/<database></value>
    </property>
    ```

6. Obtain the `hive-site.xml` file from your Hadoop cluster and examine the file.

7. If the `hive.server2.authentication` property in `hive-site.xml` is set to `NOSASL`, HiveServer2 performs no authentication. Add the following connection-level property to `jdbc-site.xml`:

    ``` xml
    <property>
        <name>jdbc.connection.property.auth</name>
        <value>noSasl</value>
    </property>
    ```
    Alternatively, you may choose to add `;auth=noSasl` to the `jdbc.url`.

8. If the `hive.server2.authentication` property in `hive-site.xml` is set to `NONE`, or the property is not specified, you must set the `jdbc.user` property. The value to which you set the `jdbc.user` property is dependent upon the `hive.server2.enable.doAs` impersonation setting in `hive-site.xml`:

    1. If `hive.server2.enable.doAs` is set to `TRUE` (the default), Hive runs Hadoop operations on behalf of the user connecting to Hive. *Choose/perform one of the following options*:

        **Set** `jdbc.user` to specify the user that has read permission on all Hive data accessed by Greenplum Database. For example, to connect to Hive and run all requests as user `gpadmin`:

        ``` xml
        <property>
            <name>jdbc.user</name>
            <value>gpadmin</value>
        </property>
        ```
        **Or**, turn on JDBC server-level user impersonation so that PXF automatically uses the Greenplum Database user name to connect to Hive; uncomment the `pxf.service.user.impersonation` property in `jdbc-site.xml` and set the value to `true:

        ``` xml
        <property>
            <name>pxf.service.user.impersonation</name>
            <value>true</value>
        </property>
        ```
        If you enable JDBC impersonation in this manner, you must not specify a `jdbc.user` nor include the setting in the `jdbc.url`.

    2. If required, create a PXF user configuration file as described in [Configuring a PXF User](cfg_server.html#usercfg) to manage the password setting.
    3. If `hive.server2.enable.doAs` is set to `FALSE`, Hive runs Hadoop operations as the user who started the HiveServer2 process, usually the user `hive`. PXF ignores the `jdbc.user` setting in this circumstance.

9. If the `hive.server2.authentication` property in `hive-site.xml` is set to `KERBEROS`:
    1. Identify the name of the server configuration.
    2. Ensure that you have configured Kerberos authentication for PXF as described in [Configuring PXF for Secure HDFS](pxf_kerbhdfs.html), and that you have specified the Kerberos principal and keytab in the `pxf-site.xml` properties as described in the procedure.
    3. Comment out the `pxf.service.user.impersonation` property in the `pxf-site.xml` file. If you require user impersonation, you will uncomment and set the property in an upcoming step.
    3. Uncomment the `hadoop.security.authentication` setting in `$PXF_BASE/servers/<name>/jdbc-site.xml`:

        ``` xml
        <property>
            <name>hadoop.security.authentication</name>
            <value>kerberos</value>
        </property>
        ```
    4. Add the `saslQop` property to `jdbc.url`, and set it to match the `hive.server2.thrift.sasl.qop` property setting in `hive-site.xml`. For example, if the `hive-site.xml` file includes the following property setting:

        ``` xml
        <property>
            <name>hive.server2.thrift.sasl.qop</name>
            <value>auth-conf</value>
        </property>
        ```
        You would add `;saslQop=auth-conf` to the `jdbc.url`.
       
    5. Add the HiverServer2 `principal` name to the `jdbc.url`. For example:

        <pre>
        jdbc:hive2://hs2server:10000/default;<b>principal=hive/hs2server@REALM</b>;saslQop=auth-conf
        </pre>
    6. If `hive.server2.enable.doAs` is set to `TRUE` (the default), Hive runs Hadoop operations on behalf of the user connecting to Hive. *Choose/perform one of the following options*:

        **Do not** specify any additional properties. In this case, PXF initiates all Hadoop access with the identity provided in the PXF Kerberos principal (usually `gpadmin`).

        **Or**, set the `hive.server2.proxy.user` property in the `jdbc.url` to specify the user that has read permission on all Hive data. For example, to connect to Hive and run all requests as the user named `integration` use the following `jdbc.url`:

        <pre>
        jdbc:hive2://hs2server:10000/default;principal=hive/hs2server@REALM;saslQop=auth-conf;<b>hive.server2.proxy.user=integration</b>
        </pre>

        **Or**, enable PXF JDBC impersonation in the `pxf-site.xml` file so that PXF automatically uses the Greenplum Database user name to connect to Hive. Add or uncomment the `pxf.service.user.impersonation` property and set the value to `true`. For example:

        ``` xml
        <property>
            <name>pxf.service.user.impersonation</name>
            <value>true</value>
        </property>
        ```
        If you enable JDBC impersonation, you must not explicitly specify a `hive.server2.proxy.user` in the `jdbc.url`.
    6. If required, create a PXF user configuration file to manage the password setting.
    7. If `hive.server2.enable.doAs` is set to `FALSE`, Hive runs Hadoop operations with the identity provided by the PXF Kerberos principal (usually `gpadmin`).

10. Save your changes and exit the editor.

11. Use the `pxf cluster sync` command to copy the new server configuration to the Greenplum Database cluster:
    
    ``` shell
    gpadmin@coordinator$ pxf cluster sync
    ```

